A NOTE ON THE MAXIMUM OF THE RIEMANN ZETA
FUNCTION, AND LOG-CORRELATED RANDOM VARIABLES

ADAM J HARPER 



Abstract. In recent work, Fyodorov and Keating conjectured the maximum size of
$|\zeta(1/2+it)|$ in a typical interval of length $O(1)$ on the critical line. They did this
by modelling the zeta function by the characteristic polynomial of a random matrix;
relating the random matrix problem to another problem from statistical mechanics;
and applying a heuristic analysis of that problem.

In this note we recover a conjecture like that of Fyodorov and Keating, but using a
different model for $|\zeta(1/2+it)|$ in terms of a random Euler product. In this case the
probabilistic model reduces to studying the supremum of Gaussian random variables
with logarithmic correlations, and can be analysed rigorously.



1. Introduction 



When $\Re(s) > 1$, the Riemann zeta function can be expressed as an absolutely
convergent Euler product:

$$\zeta(s) = \prod_{p \text{ prime}} \left(1 - \frac{1}{p^{s}}\right)^{-1}.$$

Thus $\zeta(s) \neq 0$ when $\Re(s) > 1$. For other $s$ we do not have such a nice expression for
$\zeta(s)$, and its behaviour remains quite mysterious. However, one can still approximate
$\zeta(s)$ in useful ways. Gonek, Hughes and Keating [5] showed, very roughly speaking,
that if the Riemann Hypothesis is true then

$$\zeta(1/2+it) \approx \prod_{p \leq X} \left(1 - \frac{1}{p^{1/2+it}}\right)^{-1} \cdot \prod_{|\gamma - t| \leq 1/\log X} \left(c\,i(t-\gamma)\log X\right),$$

on a wide range of the parameter $X$ (relative to $t$). Here $c > 0$ is a constant, and the
second product is over ordinates $\gamma$ of zeros of the zeta function.
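
As a quick numerical illustration of the Euler product in the region of absolute convergence, the following sketch compares a truncated product at $s = 2$ with the known value $\zeta(2) = \pi^2/6$. The helper names `primes_up_to` and `truncated_euler_product`, and the truncation point $10^4$, are choices made for this example only and do not come from the paper.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Mark every multiple of i from i*i upwards as composite.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def truncated_euler_product(s, x):
    """Product over primes p <= x of (1 - p^{-s})^{-1}, for real s > 1."""
    product = 1.0
    for p in primes_up_to(x):
        product /= 1.0 - p ** (-s)
    return product

zeta_2 = math.pi ** 2 / 6                       # exact value of zeta(2)
approx = truncated_euler_product(2.0, 10_000)   # illustrative truncation point
print(zeta_2, approx)
```

Every omitted factor exceeds 1, so the truncated product approaches $\zeta(2)$ from below. No comparably simple check is available on the critical line, which is where the hybrid formula above becomes relevant.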

The ordinates $\gamma$ and the numbers $p^{-it}$ are difficult to analyse rigorously, but it is
widely believed that they behave like certain random objects. One might imagine that
the $p^{-it}$ behave, for most $t$, like independent random variables $U_p$ distributed uniformly
on the unit circle, and Selberg's central limit theorem for $\log|\zeta(1/2+it)|$ provides
rigorous support for that belief. The correct model for the $\gamma$ is believed to be the
eigenvalues of a random unitary matrix. Inserting these random objects in the "hybrid
Euler-Hadamard product" above, one obtains (for any choice of $X$) a random model
for $\zeta(1/2+it)$. Farmer, Gonek and Hughes [2] analysed the likely (i.e. with probability
$1 - o(1)$) behaviour of the maximum of $T$ independent copies of that random model, and
so were led to a conjecture about $\max_{0 \leq t \leq T}|\zeta(1/2+it)|$. It turns out that one derives
the same conjecture for a wide range of choices of $X$.

More recently, Fyodorov and Keating [4] considered a short interval version of the
maximum, namely $\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)|$. See also the announcement [3] by
Fyodorov, Hiary and Keating. Amongst other things, they make a conjecture that may be
interpreted as follows:

Conjecture 1 (Fyodorov and Keating, 2012). Let $\epsilon > 0$, and suppose that $T_1$ is large
enough depending on $\epsilon$. Then for all $T_1 \leq T \leq 2T_1$, except for a set of "bad" $T$ of
measure at most $\epsilon T_1$, we have

$$\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)| = \exp\left(\log\log T - (3/4)\log\log\log T + O_{\epsilon}(1)\right),$$

where $O_{\epsilon}(1)$ denotes a quantity bounded in terms of $\epsilon$.

As well as finding it interesting in itself, Fyodorov and Keating [4] observe that their
conjecture is easier to investigate numerically than the conjecture about
$\max_{0 \leq t \leq T}|\zeta(1/2+it)|$, because one only needs to calculate the behaviour of the zeta function in several
randomly chosen intervals of length $2\pi$, rather than in a complete interval of length $T$.
See Fyodorov and Keating's paper for a discussion of such numerical experiments.

As in the work of Farmer, Gonek and Hughes [2] on $\max_{0 \leq t \leq T}|\zeta(1/2+it)|$, and in
huge amounts of previous work on e.g. the moments $\int_0^T |\zeta(1/2+it)|^{2k}dt$ of the zeta
function, Fyodorov and Keating are led to their conjectures by modelling $|\zeta(1/2+it)|$
using the characteristic polynomials of random matrices. More precisely, they speculate
that $\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)|$ will typically behave in the same way as

$$\max_{0 \leq \theta \leq 2\pi}|p_N(\theta)|,$$

where $p_N(\theta)$ is the characteristic polynomial of a random $N \times N$ unitary matrix, and $N$
is the integer closest to $\log T$. However, it seems difficult to precisely analyse the random
variable $\max_{0 \leq \theta \leq 2\pi}|p_N(\theta)|$ in a rigorous way. As Fyodorov and Keating explain, when
Farmer, Gonek and Hughes [2] encounter $\max_{0 \leq \theta \leq 2\pi}|p_N(\theta)|$ they only require some
information about its extreme tail behaviour, because they then take the maximum of $T$
independent copies. Fyodorov and Keating [4] require precise distributional information,
and end up relying on a (very interesting) heuristic analysis based on comparing
$\max_{0 \leq \theta \leq 2\pi}|p_N(\theta)|$ with some statistical mechanics problems.

Fyodorov and Keating [4] actually make an even more precise conjecture, about the distribution of
$\log\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)| - \log\log T + (3/4)\log\log\log T$ as $T$ varies.

In this note we investigate $\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)|$ using a random Euler product
model, that roughly corresponds to choosing $X$ as a power of $T$ in the hybrid
Euler-Hadamard product. The conjecture we derive turns out to be essentially the same as
Conjecture 1, and the probabilistic analysis of this model can be performed rigorously,
which hopefully serves as additional evidence in favour of Conjecture 1.

First we show that, except on a set of small measure (essentially consisting of points
very close to zeta zeros), $\log|\zeta(1/2+it)|$ is very close to
$\Re\left(\sum_{p \leq T}\frac{1}{p^{1/2+it}}\frac{\log(T/p)}{\log T}\right)$. We
do this by adapting an argument of Soundararajan [11] that gave an upper bound for
$\log|\zeta(1/2+it)|$, which is valid for all $t$ and which we will also need. These results
assume the Riemann Hypothesis, but this seems acceptable in our heuristic context.

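Proposition 1 (Adapted from the Main Proposition of Soundararajan [11]). Assume
the Riemann Hypothesis is true, and let $T$ be large. Then for any $T \leq t \leq 2T$ we have

$$\log|\zeta(1/2+it)| \leq \Re\left(\sum_{p \leq T}\frac{1}{p^{1/2+1/\log T+it}}\frac{\log(T/p)}{\log T} + \sum_{p^2 \leq T}\frac{(1/2)}{p^{1+2/\log T+2it}}\frac{\log(T/p^2)}{\log T}\right) + O(1).$$

Moreover, there exists a set $\mathcal{H} \subseteq [T, T+2\pi]$, of measure at least $1.99\pi$, such that

$$\log|\zeta(1/2+it)| = \Re\left(\sum_{p \leq T}\frac{1}{p^{1/2+it}}\frac{\log(T/p)}{\log T}\right) + O(1) \qquad \forall t \in \mathcal{H}.$$

We will prove Proposition 1 in §2.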

The other part of our analysis is the following probabilistic result, which we shall 
discuss further in just a moment. 

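Proposition 2. Let $T$ be large. Let $(U_p)_{p \leq T}$ be independent random variables, each
distributed uniformly on the unit circle in $\mathbb{C}$. Then

$$\mathbb{P}\left(\max_{0 \leq h \leq 2\pi}\Re\left(\sum_{p \leq T}\frac{U_p}{p^{1/2+1/\log T+ih}}\frac{\log(T/p)}{\log T} + \sum_{p \leq \sqrt{T}}\frac{(1/2)U_p^2}{p^{1+2/\log T+2ih}}\frac{\log(T/p^2)}{\log T}\right) \leq \log\log T - (1/4)\log\log\log T + O(\sqrt{\log\log\log T})\right)$$

is $1 - o(1)$ as $T \to \infty$.

Moreover, if $\mathcal{H} \subseteq [0, 2\pi]$ is any fixed set of measure at least $1.99\pi$ (say) then

$$\mathbb{P}\left(\max_{h \in \mathcal{H}}\Re\left(\sum_{p \leq T}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}\right) \geq \log\log T - 2\log\log\log T - O((\log\log\log T)^{3/4})\right) = 1 - o(1).$$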
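
To make the random model in the second part of Proposition 2 concrete, here is a minimal Monte Carlo sketch that draws the $U_p$ uniformly on the unit circle and takes the maximum of the random sum over a finite grid of $h \in [0, 2\pi)$. The function name `model_max`, the grid size, the seed and the value $T = 10^4$ are assumptions made for this illustration; at such small $T$ the triple-logarithm corrections are not visible, so this only shows how the object is built.

```python
import math
import random

def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def model_max(T, grid_points, rng):
    """Max over a grid of h in [0, 2*pi) of
    Re sum_{p <= T} U_p p^{-1/2 - ih} log(T/p)/log T,
    where U_p = exp(i*theta_p) with theta_p uniform on [0, 2*pi)."""
    primes = primes_up_to(T)
    log_T = math.log(T)
    weights = [math.log(T / p) / (log_T * math.sqrt(p)) for p in primes]
    log_p = [math.log(p) for p in primes]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in primes]
    best = -math.inf
    for k in range(grid_points):
        h = 2.0 * math.pi * k / grid_points
        # Re(U_p p^{-ih}) = cos(theta_p - h log p)
        value = sum(w * math.cos(theta - h * lp)
                    for w, theta, lp in zip(weights, phases, log_p))
        best = max(best, value)
    return best

T = 10_000                                   # illustrative size only
sample = model_max(T, 200, random.Random(0))
print(sample, math.log(math.log(T)))         # compare with log log T
```

A grid only gives a lower bound for the true maximum over the interval; the chaining argument in §3.1 is what controls the difference in the actual proof.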
As discussed previously, it seems reasonable to assume that for a "typical" value of $T$
the set of values $(p^{-iT})_{p \leq T}$ will behave, in an average sense, like $(U_p)_{p \leq T}$. Thus it seems
reasonable to assume that, for a typical value of $T$,
$\left(\Re\left(\sum_{p \leq T}\frac{1}{p^{1/2+it}}\frac{\log(T/p)}{\log T}\right)\right)_{T \leq t \leq T+2\pi}$
will behave in the same way as
$\left(\Re\left(\sum_{p \leq T}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}\right)\right)_{0 \leq h \leq 2\pi}$.
Combining this assumption with Propositions 1 and 2, we are led to the following conjecture:

Conjecture 2. Let $\epsilon > 0$, and suppose that $T_1$ is large enough depending on $\epsilon$. Then
for all $T_1 \leq T \leq 2T_1$, except for a set of "bad" $T$ of measure at most $\epsilon T_1$, we have

$$\exp\left(\log\log T - (2 + o(1))\log\log\log T\right) \leq \max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)| \leq \exp\left(\log\log T - (1/4 + o(1))\log\log\log T\right).$$

Thus, as claimed, we were led to essentially the same conclusion as Conjecture 1. If
we could prove a more precise version of Proposition 2 then a more precise version of
Conjecture 2 would follow.

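We conclude this introduction with some remarks about Proposition 2. If $|h_1 - h_2| \leq
1/\log T$ then the random variables

$$X(h_1) := \Re\left(\sum_{p \leq T}\frac{U_p}{p^{1/2+ih_1}}\frac{\log(T/p)}{\log T}\right), \qquad X(h_2) := \Re\left(\sum_{p \leq T}\frac{U_p}{p^{1/2+ih_2}}\frac{\log(T/p)}{\log T}\right)$$

are almost perfectly correlated, since $p^{ih_1} \approx p^{ih_2}$ for all $p \leq T$. When $0 \leq h_1, h_2 \leq 2\pi$
are further apart we have

$$\begin{aligned}
\mathbb{E}X(h_1)X(h_2) &= \sum_{p_1, p_2 \leq T}\frac{\mathbb{E}\,\Re(U_{p_1}p_1^{-ih_1})\,\Re(U_{p_2}p_2^{-ih_2})}{p_1^{1/2}p_2^{1/2}}\,\frac{\log(T/p_1)\log(T/p_2)}{\log^2 T} \\
&= \frac{1}{2}\sum_{p \leq T}\frac{\cos((h_1 - h_2)\log p)}{p}\,\frac{\log^2(T/p)}{\log^2 T} \\
&\approx (1/2)\log(1/|h_1 - h_2|),
\end{aligned}$$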

where the second equality follows by writing $\Re(U_p p^{-ih}) = (1/2)(U_p p^{-ih} + \overline{U_p}p^{ih})$ (and
noting that $\mathbb{E}U_{p_1}U_{p_2} = 0$ for all $p_1, p_2$), and the third line from standard estimates for
sums over primes. See the first appendix, below, for a precise calculation. Unsurprisingly
given the origin of the $X(h)$, this logarithmic type of covariance structure matches
the two-point correlation function of $\log|\zeta(1/2+i(T+h))|$, as discussed in Fyodorov
and Keating's paper [4]. Such covariance structures also appear in connection with
branching random walks, with the Gaussian Free Field, and in general with random
variables indexed by trees. See Zeitouni's lecture notes [13] for some discussion of the
former problems, and also see the references cited by Fyodorov and Keating.

It is known that $\log|p_N(\theta)|$ can be expressed as a trigonometric series in $\theta$, whose coefficients are
random variables any finite number of which have independent Gaussian limiting distributions (as
$N \to \infty$). See Diaconis and Shahshahani's paper [1], and also the discussion in Fyodorov and Keating's
paper [4]. Thus it may not be too surprising that Conjecture 2, which is derived from studying random
trigonometric sums, matches Conjecture 1, which reflects the presumed behaviour of $\log|p_N(\theta)|$.
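
The logarithmic covariance described above can be inspected numerically. The sketch below evaluates the exact prime sum $\frac{1}{2}\sum_{p \leq T}\frac{\cos(\delta\log p)}{p}\frac{\log^2(T/p)}{\log^2 T}$ for a few separations $\delta = |h_1 - h_2|$ and prints it next to $(1/2)\log(1/\delta)$. The function name `covariance` and the value $T = 10^6$ are assumptions for this example, and the two columns are only expected to agree up to a bounded additive error.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def covariance(delta, T, primes=None):
    """(1/2) * sum_{p <= T} cos(delta log p)/p * (log(T/p)/log T)^2."""
    if primes is None:
        primes = primes_up_to(T)
    log_T = math.log(T)
    return 0.5 * sum(
        math.cos(delta * math.log(p)) / p * (math.log(T / p) / log_T) ** 2
        for p in primes
    )

T = 10 ** 6                      # illustrative size only
ps = primes_up_to(T)
for delta in (0.1, 0.2, 0.5, 1.0):
    # Exact model covariance next to the heuristic (1/2) log(1/delta).
    print(delta, covariance(delta, T, ps), 0.5 * math.log(1.0 / delta))
```

Passing the prime list in explicitly avoids re-sieving for every value of $\delta$; it is a convenience of this sketch rather than anything suggested by the text.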

The upper bound in Proposition 2 is, in principle, quite straightforward to obtain, but
complications arise because the maximum is taken over an interval rather than a discrete
set, and in trying to obtain the subtracted term $-(1/4)\log\log\log T$. We will prove it in
§3.1. The lower bound is harder, but one can use a quantitative form of the multivariate
central limit theorem to replace the random variables
$\left(\Re\left(\sum_{p \leq T}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}\right)\right)_{h \in \mathcal{H}}$
by Gaussian random variables with the same means and covariances, and then use a
general lower bound inequality from the author's paper [6]. See §3.3, below. In fact that
paper studies random variables very much like the $X(h)$, in connection with a problem
of Halász on random multiplicative functions. (In [6] the $U_p$ are replaced by real random
variables, and the range of $h$ is a little different, but otherwise the situations are very
similar.) The author hopes that the exposition here will be a useful supplement to that
in [6], where most of the focus was on establishing the basic inequality rather than the
application.

As Fyodorov and Keating [4] briefly discuss (and see §3.1, below), if the values
$\log|\zeta(1/2+i(T+h))|$ behaved "independently" when $|h_1 - h_2| \geq 1/\log T$ then the
subtracted term $-(3/4)\log\log\log T$ in Conjecture 1 would be incorrect, and instead
one would have a subtracted term $-(1/4)\log\log\log T$. The term $-(3/4)\log\log\log T$ is
believed to be a manifestation of the logarithmic correlations of the
$\log|\zeta(1/2+i(T+h))|$ (and the corresponding random models). Thus it seems an interesting problem
to sharpen Proposition 2, and to rigorously analyse the random matrix model that
motivated Fyodorov and Keating. It is also conceivable that one could obtain rigorous
results, in the direction of Conjecture 1, about the zeta function itself. For example, it
is well known that if $T \leq t \leq 2T$ then $\zeta(1/2+it) = \sum_{n \leq T} 1/n^{1/2+it} + O(1/\sqrt{T})$, and so

$$\sum_{T_1 \leq n \leq 2T_1}\max_{n \leq t \leq n+2\pi}|\zeta(1/2+it)|^2 \ll \sum_{T_1 \leq n \leq 2T_1}\max_{n \leq t \leq n+2\pi}\left|\sum_{m \leq T_1}\frac{1}{m^{1/2+it}}\right|^2 + O(1) \ll T_1\log 2T_1,$$

using a discrete mean value result for Dirichlet polynomials (as in e.g. Theorem 5.3 of
Ivić [7]). This implies that $\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)| = O_{\epsilon}(\log T)$, except for a set of
bad $T_1 \leq T \leq 2T_1$ of measure at most $\epsilon T_1$. However, it is not clear how to obtain any
information at the level of the $\log\log\log T$ corrections in Conjectures 1 and 2.

2. The number theoretic part : Proof of Proposition 1 

The first part of Proposition 1, giving an upper bound for $\log|\zeta(1/2+it)|$, is a special
case of the Main Proposition of Soundararajan [11] (choosing $x = T$ and $\lambda = 1$ there).
Thus we will just prove the second part of the proposition.

Let $C, K > 0$ be large absolute constants, suitable values of which could be extracted
from the following arguments. We will begin by showing that if $T \leq t \leq T+2\pi$ satisfies

$$\sum_{\gamma}\frac{1}{|t - \gamma|^2} \leq C\log^2 T,$$

where $\gamma$ denotes the ordinates of non-trivial zeta zeros, then

$$\log|\zeta(1/2+it)| = \Re\left(\sum_{n \leq T}\frac{\Lambda(n)}{n^{1/2+it}\log n}\frac{\log(T/n)}{\log T}\right) + O(1),$$

where $\Lambda(n)$ denotes the von Mangoldt function. Afterwards we will show that the sum
over $\gamma$ is indeed suitably small for most $T \leq t \leq T+2\pi$, and that
$\Re\sum_{n \leq T}\frac{\Lambda(n)}{n^{1/2+it}\log n}\frac{\log(T/n)}{\log T}$
can be replaced by $\Re\sum_{p \leq T}\frac{1}{p^{1/2+it}}\frac{\log(T/p)}{\log T}$ for most such $t$.

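Since we assume that $\sum_{\gamma} 1/|t - \gamma|^2 \leq C\log^2 T$, and we are always assuming that
$T + 2\pi \geq t \geq T$ is large, we may assume that the zeta function has no zeros or poles on
the horizontal line extending from $1/2 + it$ to (positive) infinity. Thus we have, assuming
the Riemann Hypothesis, the following estimates:

$$\log|\zeta(1/2+it)| = \Re\left(\sum_{n \leq T}\frac{\Lambda(n)}{n^{1/2+it}\log n}\frac{\log(T/n)}{\log T}\right) - \frac{1}{\log T}\Re\frac{\zeta'}{\zeta}(1/2+it) + \frac{1}{\log T}\sum_{\gamma}\Re\int_{1/2}^{\infty}\frac{T^{1/2+i\gamma-\sigma-it}}{(1/2+i\gamma-\sigma-it)^2}\,d\sigma + O\left(\frac{1}{\log T}\right);$$

$$\Re\frac{\zeta'}{\zeta}(1/2+it) = -(1/2)\log T + O(1).$$

These are essentially obtained on pages 4 and 5 of Soundararajan's article [11], the first
by integrating an explicit formula for $\zeta'/\zeta$ from $1/2 + it$ to positive infinity along a
horizontal line, and the second by taking real parts in the Hadamard (partial fraction)
formula for $\zeta'/\zeta$. Inserting the second estimate in the first, we deduce that

$$\log|\zeta(1/2+it)| = \Re\left(\sum_{n \leq T}\frac{\Lambda(n)}{n^{1/2+it}\log n}\frac{\log(T/n)}{\log T}\right) + O(1) + O\left(\frac{1}{\log T}\sum_{\gamma}\frac{1}{|t - \gamma|^2}\int_{1/2}^{\infty}T^{1/2-\sigma}\,d\sigma\right).$$

The integral here is equal to $1/\log T$, so if we have $\sum_{\gamma} 1/|t - \gamma|^2 \leq C\log^2 T$ then the
third term is $O(1)$, as claimed.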
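
For completeness, the two elementary facts used in the last step can be written out as follows. This is a routine verification supplied here for the reader, not a passage from the source; it uses only $|T^{1/2+i\gamma-\sigma-it}| = T^{1/2-\sigma}$ and the substitution $v = \sigma - 1/2$.

```latex
\[
\int_{1/2}^{\infty} T^{1/2-\sigma}\,d\sigma
  = \int_{0}^{\infty} e^{-v\log T}\,dv
  = \frac{1}{\log T},
\qquad
\left|\tfrac12 + i\gamma - \sigma - it\right|^{2}
  = \left(\sigma-\tfrac12\right)^{2} + (t-\gamma)^{2} \;\ge\; (t-\gamma)^{2},
\]
\[
\text{so}\quad
\frac{1}{\log T}\sum_{\gamma}\left|\int_{1/2}^{\infty}
  \frac{T^{1/2+i\gamma-\sigma-it}}{(1/2+i\gamma-\sigma-it)^{2}}\,d\sigma\right|
\;\le\; \frac{1}{\log^{2}T}\sum_{\gamma}\frac{1}{|t-\gamma|^{2}}
\;\le\; C .
\]
```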

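Next we define two "good" sets $\mathcal{G}^{(2)} \subseteq \mathcal{G}^{(1)} \subseteq [T, T+2\pi]$, by

$$\mathcal{G}^{(1)} := \{T \leq t \leq T+2\pi : |t - \gamma| \geq 1/(K\log T) \;\; \forall\gamma\},$$

$$\mathcal{G}^{(2)} := \Big\{t \in \mathcal{G}^{(1)} : \sum_{T-1 \leq \gamma \leq T+7}\frac{1}{|t - \gamma|^2} \leq (C/2)\log^2 T\Big\},$$

where $\gamma$ continues to run over ordinates of zeta zeros. We see

$$\int_{\mathcal{G}^{(1)}}\sum_{T-1 \leq \gamma \leq T+7}\frac{1}{|t - \gamma|^2}\,dt \leq \sum_{T-1 \leq \gamma \leq T+7}\int_{|t - \gamma| \geq 1/(K\log T)}\frac{dt}{|t - \gamma|^2} = 2K\log T\sum_{T-1 \leq \gamma \leq T+7} 1,$$

and using standard estimates for the number of zeta zeros in a horizontal strip (as in
e.g. Theorem 10.13 of Montgomery and Vaughan [8]) we find this is $\ll K\log^2 T$. Thus,
provided $C$ was set sufficiently large in terms of $K$, the measure of $\mathcal{G}^{(1)} \setminus \mathcal{G}^{(2)}$ is at most
$0.001\pi$. Similarly, provided $K$ was set sufficiently large the measure of $[T, T+2\pi] \setminus \mathcal{G}^{(1)}$
is at most $0.001\pi$ (since there are $\ll \log T$ zeros in that interval). If $t \in \mathcal{G}^{(2)}$ then
$\sum_{\gamma} 1/|t - \gamma|^2 \leq \sum_{T-1 \leq \gamma \leq T+7} 1/|t - \gamma|^2 + \sum_{\gamma} 10/(1 + |t - \gamma|^2) \leq C\log^2 T$, and so we
have this desired bound for all $T \leq t \leq T+2\pi$, except for a "bad" set of measure at
most $0.002\pi$.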

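Finally we note that, since $\sum_{p \leq \sqrt{T}}(\log p)/p \ll \log T$, we have

$$\begin{aligned}
\sum_{n \leq T}\frac{\Lambda(n)}{n^{1/2+it}\log n}\frac{\log(T/n)}{\log T} &= \sum_{p \leq T}\frac{1}{p^{1/2+it}}\frac{\log(T/p)}{\log T} + \sum_{p \leq \sqrt{T}}\frac{1/2}{p^{1+2it}}\frac{\log(T/p^2)}{\log T} + O(1) \\
&= \sum_{p \leq T}\frac{1}{p^{1/2+it}}\frac{\log(T/p)}{\log T} + \sum_{p \leq \sqrt{T}}\frac{1/2}{p^{1+2it}} + O(1),
\end{aligned}$$

so it will complete the proof of Proposition 1 if we can show that

$$\text{meas}\Big\{T \leq t \leq T+2\pi : \Big|\sum_{p \leq \sqrt{T}}\frac{1}{p^{1+2it}}\Big| > C\Big\} \leq 0.008\pi.$$

But this is an easy consequence of the fact that
$\int_T^{T+2\pi}\left|\sum_{p \leq \sqrt{T}}\frac{1}{p^{1+2it}}\right|^2 dt \ll 1$, which follows
from a suitable form of Plancherel's identity (as in e.g. equation (5.26) of Montgomery
and Vaughan [8]).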

Q.E.D. 

3. The probabilistic part : Proof of Proposition 2 
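3.1. The upper bound. For the sake of concision, let us write

$$V(p,h) := \frac{U_p}{p^{1/\log T+ih}}\frac{\log(T/p)}{\log T} + \frac{(1/2)U_p^2}{p^{1/2+2/\log T+2ih}}\frac{\log(T/p^2)}{\log T}\mathbf{1}_{p \leq \sqrt{T}},$$

where $\mathbf{1}$ denotes the indicator function. To prove the upper bound in Proposition 2, we
need to show that there exists an absolute constant $C > 0$ such that

$$\mathbb{P}\Big(\max_{0 \leq h \leq 2\pi}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + C\sqrt{\log\log\log T}\Big) = o(1)$$

as $T \to \infty$.

Using the union bound, this probability is certainly at most

$$\sum_{0 \leq j \leq 2\pi\log T}\mathbb{P}\Big(\max_{\frac{j}{\log T} \leq h \leq \frac{j+1}{\log T}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + C\sqrt{\log\log\log T}\Big),$$

and since $U_p p^{-ij/\log T}$ has the same distribution as $U_p$, for any $j$, this is

$$\ll \log T \cdot \mathbb{P}\Big(\max_{0 \leq h \leq \frac{1}{\log T}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + C\sqrt{\log\log\log T}\Big).$$

So it will suffice to show that the probability above is $o(1/\log T)$ as $T \to \infty$.

The random sums $\Re\sum_{p \leq T}\frac{V(p,h_1)}{p^{1/2}}$, $\Re\sum_{p \leq T}\frac{V(p,h_2)}{p^{1/2}}$ are almost perfectly correlated when
$|h_1 - h_2| \leq 1/\log T$, so one might expect the probability above to be roughly the same
as $\mathbb{P}\big(\Re\sum_{p \leq T}\frac{V(p,0)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + C\sqrt{\log\log\log T}\big)$. This is
essentially true, and in fact we will shortly prove the following result: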

Lemma 1. If $T$ is large, and $C > 0$ is a sufficiently large absolute constant, then

$$\begin{aligned}
&\mathbb{P}\Big(\max_{0 \leq h \leq \frac{1}{\log T}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + C\sqrt{\log\log\log T}\Big) \\
&\qquad \leq \mathbb{P}\Big(\Re\sum_{p \leq T}\frac{V(p,0)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + \sqrt{\log\log\log T}\Big) + O\Big(\frac{1}{\log T\log\log T}\Big).
\end{aligned}$$

Now we need a bound for the probability that $\Re\sum_{p \leq T}V(p,0)/p^{1/2}$, which is simply a
sum of independent random variables, is large. There are many such bounds available,
but the next result, which follows from Theorem 3.3 of Talagrand [12] (by choosing
$u = t/\sigma^2$ to bound the infimum of characteristic functions there), is slightly sharper
than most of those and will allow us to have the subtracted term $-(1/4)\log\log\log T$.

Tail Inequality 1 (Talagrand, 1995). There exists an absolute constant $K > 0$ such
that the following is true. Suppose $X_1, ..., X_n$ are independent, mean zero random
variables, and suppose $B > 0$ is such that $|X_i| \leq B$ almost surely, for all $i$. Set
$\sigma^2 := \mathbb{E}\big(\sum_{1 \leq i \leq n}X_i\big)^2 = \sum_{1 \leq i \leq n}\mathbb{E}X_i^2$. Then for any $0 \leq t \leq \sigma^2/KB$ we have

$$\mathbb{P}\Big(\sum_{1 \leq i \leq n}X_i \geq t\Big) \ll \frac{\sigma}{t}\exp\Big(-t^2/(2\sigma^2) + O\Big(|t/\sigma^2|^3\sum_{1 \leq i \leq n}\mathbb{E}|X_i|^3\Big)\Big).$$

Recall that if $Z$ is a standard normal random variable, and $z$ is large, then $\mathbb{P}(Z > z) \ll (1/z)e^{-z^2/2}$.
Most tail probability bounds for sums of independent random variables have an exponential component
like this, but do not include the multiplier $1/z$. Talagrand's result supplies such a multiplier, which
will let us obtain a probability bound $o(1/\log T)$ at a slightly lower threshold.
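
The role of the multiplier $1/z$ can be seen numerically in the Gaussian case. The sketch below compares the exact standard normal tail, computed with `math.erfc`, against the classical bound $e^{-z^2/2}/(z\sqrt{2\pi})$ and against the cruder bound $e^{-z^2/2}$ with no multiplier. The function names are chosen for this illustration only.

```python
import math

def gaussian_tail(z):
    """Exact P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def tail_bound_with_multiplier(z):
    """Classical bound exp(-z^2/2) / (z * sqrt(2*pi)), valid for z > 0."""
    return math.exp(-z * z / 2.0) / (z * math.sqrt(2.0 * math.pi))

def tail_bound_without_multiplier(z):
    """Cruder bound exp(-z^2/2), which lacks the 1/z factor."""
    return math.exp(-z * z / 2.0)

for z in (1.0, 2.0, 4.0, 6.0):
    print(z, gaussian_tail(z), tail_bound_with_multiplier(z),
          tail_bound_without_multiplier(z))
```

At $z \approx \sqrt{2\log\log T}$ the extra factor $1/z$ is of size $(\log\log T)^{-1/2}$, which is exactly the saving exploited in the calculation that follows.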
Applying Tail Inequality 1 to $\sum_{1000K^2 < p \leq T}\Re\big(V(p,0)/p^{1/2}\big)$, we see we can take
$B = 1/(10K)$ (say), and we have

$$\sigma^2 = \sum_{1000K^2 < p \leq T}\mathbb{E}\big(\Re(V(p,0)/p^{1/2})\big)^2 = (1/2)\sum_{1000K^2 < p \leq T}\frac{1}{p} + O(1) = (1/2)\log\log T + O(1)$$

and $\sum_{1000K^2 < p \leq T}\mathbb{E}\big|\Re(V(p,0)/p^{1/2})\big|^3 \leq \sum_{1000K^2 < p \leq T}1/p^{3/2} \leq 1$. See the first
appendix, below, for some similar variance calculations. Thus we have

$$\begin{aligned}
&\mathbb{P}\Big(\Re\sum_{p \leq T}\frac{V(p,0)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + \sqrt{\log\log\log T}\Big) \\
&\qquad \leq \mathbb{P}\Big(\Re\sum_{1000K^2 < p \leq T}\frac{V(p,0)}{p^{1/2}} > \log\log T - (1/4)\log\log\log T + (1/2)\sqrt{\log\log\log T}\Big) \\
&\qquad \ll \frac{1}{\sqrt{\log\log T}}\exp\Big(\frac{-(\log\log T - (1/4)\log\log\log T + (1/2)\sqrt{\log\log\log T})^2}{\log\log T + O(1)} + O(1)\Big) \\
&\qquad = \frac{1}{\sqrt{\log\log T}}\exp\Big(-\log\log T + (1/2)\log\log\log T - \sqrt{\log\log\log T} + O(1)\Big),
\end{aligned}$$

provided that $T$ is large enough. This bound is $o(1/\log T)$, which suffices to establish
the upper bound in Proposition 2 provided we can prove Lemma 1.
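
The passage from the squared threshold to the final exponent can be checked by hand. In the worked expansion below, $L$ and $\ell$ are shorthand introduced only for this verification, standing for $\log\log T$ and $\log\log\log T$ respectively.

```latex
\[
L := \log\log T, \qquad \ell := \log\log\log T, \qquad
t := L - \tfrac14\ell + \tfrac12\sqrt{\ell}.
\]
\[
\frac{t^{2}}{L + O(1)}
  = L - \tfrac12\ell + \sqrt{\ell} + O(1),
\qquad\text{since}\qquad
\frac{\left(\tfrac14\ell - \tfrac12\sqrt{\ell}\right)^{2}}{L} = o(1).
\]
\[
\frac{1}{\sqrt{L}}\exp\!\left(-L + \tfrac12\ell - \sqrt{\ell} + O(1)\right)
  = e^{-\ell/2}\cdot\frac{e^{\ell/2}}{\log T}\cdot e^{-\sqrt{\ell} + O(1)}
  = \frac{e^{-\sqrt{\ell} + O(1)}}{\log T}
  = o\!\left(\frac{1}{\log T}\right).
\]
```

The factor $1/\sqrt{L} = e^{-\ell/2}$ supplied by the tail inequality cancels the $e^{\ell/2}$ in the exponential, which is why the threshold can carry the term $-(1/4)\log\log\log T$.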



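To prove the lemma we will use a chaining argument, whereby we approximate
$\max_{0 \leq h \leq \frac{1}{\log T}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}}$ by the maximum over increasingly sparse discrete sets of
points $h$. This argument would be easier if the random sums $\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}}$ were Gaussian
random variables, and in §§3.2-3.3, where we prove lower bounds, we will have to
use a central limit theorem to pass to that situation. However, here we will make do
without that step.

Let $\mathcal{H}_0 := \{0\}$, and for $1 \leq k \leq \log T$ let $\mathcal{H}_k := \{i/(2^k\log T) : 0 \leq i \leq 2^k\}$ (so that
$\mathcal{H}_{k-1} \subseteq \mathcal{H}_k$ $\forall k$), and note that we certainly have

$$\begin{aligned}
\max_{0 \leq h \leq \frac{1}{\log T}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} &= \max_{h \in \mathcal{H}_{[\log T]}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} + O(1) \\
&\leq \max_{h \in \mathcal{H}_{[\log T]}}\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h)}{p^{1/2}}\Big) + \max_{h \in \mathcal{H}_{[\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) + O(1).
\end{aligned}$$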

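We split into two pieces here because, as the reader will soon see, the part of the sum
over small primes is more highly correlated at short distances (and also contributes most
of the expected size of the maximum), and so it is easier to handle. Now given a point
$h \in \mathcal{H}_{[\log T]}$, define $h^{(k)} := \max\{t \leq h : t \in \mathcal{H}_k\}$, and note that
$\Re\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h)}{p^{1/2}}$ is

$$\begin{aligned}
&\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) + \sum_{1 \leq k \leq \log T}\Big(\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h^{(k)})}{p^{1/2}}\Big) - \Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h^{(k-1)})}{p^{1/2}}\Big)\Big) \\
&\qquad = \Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) + \sum_{1 \leq k \leq \log T}\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h^{(k)}) - V(p,h^{(k-1)})}{p^{1/2}}\Big).
\end{aligned}$$

This is the chain decomposition that will drive our argument. Indeed, unless there is
some $k$ such that $\max_{h \in \mathcal{H}_k}\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h) - V(p,h^{(k-1)})}{p^{1/2}}\Big) > k^{0.9}2^{-k}$ (say) the
contribution from $\sum_{1 \leq k \leq \log T}$ will clearly be $O(1)$, and therefore for any $u \in \mathbb{R}$ we have

$$\begin{aligned}
\mathbb{P}\Big(\max_{h \in \mathcal{H}_{[\log T]}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} > u\Big) &\leq \mathbb{P}\Big(\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) + \max_{h \in \mathcal{H}_{[\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) > u - O(1)\Big) \\
&\quad + \sum_{1 \leq k \leq \log T}\sum_{h \in \mathcal{H}_k}\mathbb{P}\Big(\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h) - V(p,h^{(k-1)})}{p^{1/2}}\Big) > k^{0.9}2^{-k}\Big). \qquad (3.1)
\end{aligned}$$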

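Next, for any $h \in \mathcal{H}_{k-1}$ the sum $\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h) - V(p,h^{(k-1)})}{p^{1/2}}\Big)$ vanishes, and for
any $h \in \mathcal{H}_k \setminus \mathcal{H}_{k-1}$ it has mean zero and variance

$$\sum_{p \leq T^{1/\log\log T}}\mathbb{E}\Big(\Re\frac{V(p,h) - V(p,h^{(k-1)})}{p^{1/2}}\Big)^2 \asymp \sum_{p \leq T^{1/\log\log T}}\frac{1}{p}\,\mathbb{E}\big(\Re U_p(p^{-ih} - p^{-ih^{(k-1)}})\big)^2 \asymp \sum_{p \leq T^{1/\log\log T}}\frac{1}{p}\frac{\log^2 p}{2^{2k}\log^2 T} \asymp \frac{1}{(2^k\log\log T)^2},$$

since $|1 - p^{i(h - h^{(k-1)})}| \asymp |1 - p^{i/(2^k\log T)}|$ for all $h \in \mathcal{H}_k \setminus \mathcal{H}_{k-1}$, and $\sum_{p \leq x}\frac{\log^2 p}{p} \asymp \log^2 x$.
Using Tail Inequality 1 (with $B = 100/(2^k\log T)$, say), it follows that for all $h \in \mathcal{H}_k$,

$$\mathbb{P}\Big(\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,h) - V(p,h^{(k-1)})}{p^{1/2}}\Big) > k^{0.9}2^{-k}\Big) \ll \exp\big(-ck^{1.8}(\log\log T)^2\big) \ll \frac{1}{2^{2k}\log^2 T},$$

where $c > 0$ is a small absolute constant. Thus the sums in (3.1) are collectively of size
$O(1/\log^2 T)$, which is more than acceptable for Lemma 1. A similar chaining argument
can be applied to $\max_{h \in \mathcal{H}_{[\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big)$, but in this case the variance of
the pieces is $O(2^{-2k})$ rather than $O(2^{-2k}(\log\log T)^{-2})$, and so we can only immediately
handle large values of $k$ (say $\log\log\log T \leq k \leq \log T$). We can deduce that, for any
$u \in \mathbb{R}$, the probability $\mathbb{P}\Big(\max_{0 \leq h \leq \frac{1}{\log T}}\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}} > u\Big)$ is

$$\leq \mathbb{P}\Big(\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) + \max_{h \in \mathcal{H}_{[\log\log\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) > u - O(1)\Big) + O\Big(\frac{1}{\log^2 T}\Big).$$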

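Finally set $u = \log\log T - (1/4)\log\log\log T + C\sqrt{\log\log\log T}$, as in Lemma 1, and
notice if $\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) + \max_{h \in \mathcal{H}_{[\log\log\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) > u - O(1)$
then one of the following must occur:

• $\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) + \max_{h \in \mathcal{H}_{[\log\log\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) > u - O(1)$, and also
$\max_{h \in \mathcal{H}_{[\log\log\log T]}}\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) > (C/2)\log\log\log T$;

• $\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) > u - O(1) - (C/2)\log\log\log T$, and also
$\max_{h \in \mathcal{H}_{[\log\log\log T]}}\Big(\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big) - \Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,0)}{p^{1/2}}\Big)\Big) > (C/2)\sqrt{\log\log\log T}$;

• $\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) > u - O(1) - (C/2)\log\log\log T$, and also
$\Re\Big(\sum_{p \leq T}\frac{V(p,0)}{p^{1/2}}\Big) > u - (C/2)\sqrt{\log\log\log T} - O(1)$.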

The probability of the third event here is at most as large as the probability in the
conclusion of Lemma 1, so to prove the lemma it will suffice to show that each of the
other events has probability $O(1/(\log T\log\log T))$. This follows using the independence
of the sums over $p \leq T^{1/\log\log T}$ and over $T^{1/\log\log T} < p \leq T$, and using the union bound.
For example, the probability of the second event is

$$\begin{aligned}
&\leq \mathbb{P}\Big(\Re\Big(\sum_{p \leq T^{1/\log\log T}}\frac{V(p,0)}{p^{1/2}}\Big) > u - O(1) - (C/2)\log\log\log T\Big) \cdot \sum_{h \in \mathcal{H}_{[\log\log\log T]}}\mathbb{P}\Big(\Re\Big(\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h) - V(p,0)}{p^{1/2}}\Big) > (C/2)\sqrt{\log\log\log T}\Big) \\
&\ll \exp\Big(-\frac{(\log\log T - C\log\log\log T)^2}{\log\log T}\Big) \cdot \log\log T \cdot \exp\big(-c((C/2)\sqrt{\log\log\log T})^2\big) \\
&\ll \log\log T \cdot \exp\big(-\log\log T + 2C\log\log\log T - c(C/2)^2\log\log\log T\big),
\end{aligned}$$

say, where $c > 0$ is again a small absolute constant, and we used the facts that
$\#\mathcal{H}_{[\log\log\log T]} \leq \log\log T$ and $\mathbb{E}\Big(\Re\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h) - V(p,0)}{p^{1/2}}\Big)^2 = O(1)$ (and Tail
Inequality 1, with $B = 100/T^{1/(2\log\log T)}$). This is indeed $O(1/(\log T\log\log T))$, provided
that $C > 0$ was chosen sufficiently large, and a similar argument (using the fact that
$\mathbb{E}\Big(\Re\sum_{T^{1/\log\log T} < p \leq T}\frac{V(p,h)}{p^{1/2}}\Big)^2 \sim (1/2)\log\log\log T$) applies to the probability of the first
event.
Q.E.D. 

Apart from the chaining arguments used to prove Lemma 1, the proof of the upper
bound in Proposition 2 was essentially just an application of the union bound on the
scale of $1/\log T$. If the random sums $\Re\sum_{p \leq T}\frac{V(p,h_1)}{p^{1/2}}$, $\Re\sum_{p \leq T}\frac{V(p,h_2)}{p^{1/2}}$ behaved
independently when $|h_1 - h_2| \geq 1/\log T$ then one would expect such an argument to be quite
sharp, since if $X_1, ..., X_n$ are independent then

$$\mathbb{P}\big(\max_{1 \leq i \leq n}X_i > u\big) = 1 - \mathbb{P}(X_i \leq u \;\forall i) = 1 - \prod_{i=1}^{n}\big(1 - \mathbb{P}(X_i > u)\big) \asymp \min\Big\{1, \sum_{i=1}^{n}\mathbb{P}(X_i > u)\Big\},$$

which follows since $1 - \mathbb{P}(X_i > u) \approx \exp(-\mathbb{P}(X_i > u))$. This is why the upper bound
in Proposition 2, with the subtracted term $-(1/4)\log\log\log T$, would be sharp in the
"independent at distance $1/\log T$" case, but since the sums $\Re\sum_{p \leq T}\frac{V(p,h)}{p^{1/2}}$ are actually
logarithmically correlated we expect their maximum to be a little smaller (since we have
"fewer independent tries at obtaining a large value").

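The sharpness of the union bound for independent variables can be checked directly. The sketch below computes $1 - \prod_i(1 - q_i)$ exactly for a list of tail probabilities $q_i = \mathbb{P}(X_i > u)$ and compares it with $\min\{1, \sum_i q_i\}$. The function names and the sample probabilities are assumptions of this example, and the constant $1 - 1/e$ is the standard one obtained from $1 - q \leq e^{-q}$.

```python
import math

def prob_max_exceeds(tail_probs):
    """Exact P(max_i X_i > u) for independent X_i with P(X_i > u) = tail_probs[i]."""
    prob_none = 1.0
    for q in tail_probs:
        prob_none *= 1.0 - q
    return 1.0 - prob_none

def union_bound(tail_probs):
    """The union bound min{1, sum_i P(X_i > u)}."""
    return min(1.0, sum(tail_probs))

# Three regimes: total mass well below 1, about 1, and well above 1.
examples = [[0.001] * 100, [0.01] * 100, [0.05] * 100]
for probs in examples:
    exact = prob_max_exceeds(probs)
    upper = union_bound(probs)
    lower = (1.0 - math.exp(-1.0)) * upper
    print(lower, exact, upper)
```

In every regime the exact probability lies between $(1 - 1/e)\min\{1, \sum_i q_i\}$ and $\min\{1, \sum_i q_i\}$, so for independent variables the union bound loses at most a constant factor.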
We also note some remarks made by Fyodorov and Keating about the long range
maximum $\max_{0 \leq t \leq T}|\zeta(1/2+it)|$ studied by Farmer, Gonek and Hughes [2]. At the end
of §2.5 of their paper [4], Fyodorov and Keating observe: "the tail of the distribution
[of $-\log\max_{T \leq t \leq T+2\pi}|\zeta(1/2+it)| + \log\log T - (3/4)\log\log\log T$] that we predict for
much shorter ranges decays like $|x|e^{x}$ as $x \to -\infty$; that is, the exponential is linear
rather than quadratic. If this were to persist... it would suggest that $\zeta(1/2+it)$ may
take much larger values than... the Farmer-Gonek-Hughes conjecture". They add that
"there are several reasons for thinking this unlikely". The foregoing calculations easily
imply that, in our model, the long range tail decays like a quadratic exponential, and
similar considerations quite possibly apply to Fyodorov and Keating's random matrix
model.

3.2. Preliminary calculations for the lower bound. In this subsection we make
some preliminary modifications to the collection of random variables

$$\Big(\Re\sum_{p \leq T}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}\Big)_{h \in \mathcal{H}}$$

in the lower bound part of Proposition 2. At the end of the subsection we will give an
overview of the reasons for making these modifications.
Firstly let $1 \leq E = E(T) \leq \sqrt{\log T}$ be a parameter, whose value will be fixed in
§3.3. (In fact we will end up taking $E$ to be $\sqrt{\log\log T}$ times a power of $\log\log\log T$.) We claim that
there exists some $0 \leq z \leq E/\log T$ such that at least $1.98\pi(\log T)/E$ of the points

$$z + \frac{iE}{\log T}, \qquad 0 \leq i \leq (2\pi\log T)/E - 1$$

belong to the "good" set $\mathcal{H}$. Indeed, this follows immediately when we observe that, if
$\mathbf{1}$ denotes the indicator function,

$$\int_0^{E/\log T}\sum_{0 \leq i \leq (2\pi\log T)/E - 1}\mathbf{1}_{z + iE/\log T \in \mathcal{H}}\,dz \geq \text{meas}(\mathcal{H}) - \frac{E}{\log T} \geq 1.98\pi.$$

We choose such a value of $z$ and let $\mathcal{H}^* := \mathcal{H} \cap \{z + iE/\log T : 0 \leq i \leq (2\pi\log T)/E - 1\}$,
a discretisation of the set $\mathcal{H}$ that will be more convenient to work with.

Now let $2 \leq y = y(T) \leq e^{(\log\log T)^{1000}}$ be a further parameter, whose value will
also be fixed in §3.3. (In fact we will end up taking $\log y$ to be a product of powers of $\log\log T$
and $\log\log\log T$.) As we will explain shortly, for technical reasons (see below, and also the calculations and
discussion following Lemma 3 in §3.3) we need to remove the primes smaller than $y$
from our random sums. To account for the error that arises in doing this we can take
a very crude approach: for any fixed $h \in \mathcal{H}^*$ we have

$$\mathbb{E}\Big(\Re\sum_{p \leq y}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}\Big)^2 \ll \log\log y \ll \log\log\log T,$$

using the variance calculations in our first appendix (with $P = 2$ and $Q = y$), and
therefore by Chebychev's inequality we have

$$\mathbb{P}\Big(\Big|\Re\sum_{p \leq y}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}\Big| > (\log\log\log T)^{3/4}\Big) \ll (\log\log\log T)^{-1/2} = o(1).$$



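Since the random variables $(U_p)_{p \leq y}$ are independent of $(U_p)_{y < p \leq T}$, we see that to prove
the lower bound in Proposition 2 it will suffice to show that

$$\mathbb{P}\Big(\max_{h \in \mathcal{H}^*}\Re\sum_{y < p \leq T}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T} \geq \log\log T - 2\log\log\log T - (\log\log\log T)^{3/4}\Big) = 1 - o(1).$$

Next we let

$$Y(h) := \frac{\Re\sum_{y < p \leq T}\frac{U_p}{p^{1/2+ih}}\frac{\log(T/p)}{\log T}}{\sqrt{\frac{1}{2}\sum_{y < p \leq T}\frac{1}{p}\frac{\log^2(T/p)}{\log^2 T}}}, \qquad h \in \mathcal{H}^*.$$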

Using the variance calculations in our first appendix (with $P = y$ and $Q = T$), we
see the $Y(h)$ are mean zero, variance one, real-valued random variables. Moreover, if
$h_1 \neq h_2 \in \mathcal{H}^*$ (so that, in particular, $1/\log T \leq |h_1 - h_2| \leq 2\pi$) then the calculations in
our first appendix supply the following covariance estimate:

$$\mathbb{E}Y(h_1)Y(h_2) = \begin{cases} 1 - \dfrac{\log(|h_1 - h_2|\log T) + O(1)}{\log\log T - \log\log y} & \text{if } 1/\log T \leq |h_1 - h_2| \leq 1/\log y \\[2ex] O\Big(\dfrac{1}{|h_1 - h_2|\log y\,\log\log T}\Big) & \text{otherwise.} \end{cases} \qquad (3.2)$$

Here we used the fact, justified in our first appendix, that

$$\sum_{y < p \leq T}\frac{1}{p}\frac{\log^2(T/p)}{\log^2 T} = \log\log T - \log\log y + O(1) \gg \log\log T,$$

bearing in mind that $y \leq e^{(\log\log T)^{1000}}$. We also remark that the random variables $Y(h)$
are stationary, in other words the covariance $\mathbb{E}Y(h_1)Y(h_2)$ only depends on the distance
$|h_1 - h_2|$. This is visibly true at the level of the estimates (3.2), if one ignores "big Oh"
terms, and in fact it is exactly true, since we actually have

$$\mathbb{E}Y(h_1)Y(h_2) = \frac{\frac{1}{2}\sum_{y < p \leq T}\frac{\cos((h_1 - h_2)\log p)}{p}\frac{\log^2(T/p)}{\log^2 T}}{\frac{1}{2}\sum_{y < p \leq T}\frac{1}{p}\frac{\log^2(T/p)}{\log^2 T}},$$

as shown in our first appendix. The exact stationarity of the $Y(h)$ isn't really necessary
for the analysis in §3.3 (see the author's paper [6], where some very similar random
variables without this property are treated), but it is quite convenient.

Finally, using a multivariate central limit theorem (as explained in our second
appendix) we can replace the random variables $Y(h)$, $h \in \mathcal{H}^*$ by Gaussian random variables
with the same means and covariance matrix, provided that $y \geq \log^2 T$, say (so that none
of the summands in the definition of $Y(h)$ is too large relative to $\#\mathcal{H}^*$). We summarise
the state of affairs we have reached in the following lemma.

Lemma 2. Suppose $1 \leq E \leq \sqrt{\log T}$ and $\log^2 T \leq y \leq e^{(\log\log T)^{1000}}$, and let $\mathcal{H}^*$ be
as above. Let $Z(h)$, $h \in \mathcal{H}^*$ be a collection of mean zero, variance one, jointly normal
random variables with the same covariances as the random variables $Y(h)$ described
above. Then to prove the lower bound in Proposition 2, it will suffice to show that

$$\mathbb{P}\Big(\max_{h \in \mathcal{H}^*}Z(h) \geq \frac{\log\log T - 2\log\log\log T - (\log\log\log T)^{3/4} + 1}{\sqrt{(1/2)(\log\log T - \log\log y + O(1))}}\Big) = 1 - o(1).$$

Our reason for transitioning to Gaussian random variables is that all information
about their dependencies is contained in their covariances, and many tools exist for
analysing their behaviour, as we shall see in §3.3. We introduced the parameter $y$ to
make the random variables $Z(h)$ less correlated at distant values of $h$ (note the factor
$\log y$ in the denominator of the second estimate in (3.2)). This will be necessary to
complete the proof of Proposition 2 as stated, although one could obtain a weaker
result without doing this, at least for the Gaussian $Z(h)$. The parameter $E$ spaces out
the points in $\mathcal{H}^*$ so that fewer of them are very close together, and therefore fewer of
the $Z(h)$ are very highly correlated. As the reader will see in §3.3, this seems to be
quite essential to obtain a successful conclusion.

3.3. The lower bound. Our main tool for establishing the lower bound in Proposition
2 is the following, which is a slight adaptation of Theorem 1 from the author's paper [6].

Lower Bound 1 (Harper, 2013). Let $\{Z(t_i)\}_{1 \leq i \leq n}$ be jointly multivariate normal
random variables, each with mean zero and variance 1. Suppose that the sequence is
stationary, i.e. that $\mathbb{E}Z(t_i)Z(t_j) = r(|i - j|)$ for some function $r$. Finally let $u \geq 1$, and
suppose that:

• $r(m)$ is a decreasing non-negative function;

• $r(1)(1 + 2u^{-2})$ is at most 1.

Then for any subset $\mathcal{N} \subseteq \{1, 2, ..., n\}$, the probability $\mathbb{P}(\max_{i \in \mathcal{N}}Z(t_i) > u)$ is

$$\geq (\#\mathcal{N})\frac{e^{-u^2/2}}{40u}\min\Big\{1, \sqrt{\frac{1 - r(1)}{u^2 r(1)}}\Big\}\prod_{j=1}^{n-1}\Phi\Big(u\sqrt{1 - r(j)}\Big(1 + O\Big(\frac{1}{u^2(1 - r(j))}\Big)\Big)\Big),$$

where $\Phi(z) := (1/\sqrt{2\pi})\int_{-\infty}^{z}e^{-t^2/2}dt$ is the standard normal distribution function.

To get an idea of what Lower Bound 1 says, we will first apply it to the random 
variables Z{h) in Lemma 2 without carefully checking the conditions, and ignoring the 
"big Oh" error terms in the covariances f l3.2p and in the statement of the theorem. 
Afterwards we will explain how to apply the theorem properly. 

Since $\mathcal{H}^*$ is a set of points of the form $z + (iE)/\log T$, we can clearly reparametrise
the $(Z(h))_{h \in \mathcal{H}^*}$ by integers $i$, as in Lower Bound 1. Moreover, as remarked in our
Preliminary Calculations the covariances $\mathbb{E}Z(h_1)Z(h_2)$ only depend on $|h_1 - h_2|$, so
these random variables are indeed stationary. Thus if $\mathcal{X}$ is some set of indices $i$ that are
contained in an interval of length $(\log T)/(E \log y)$, so that $|h_i - h_j| \leq 1/\log y$ for the
corresponding points $h_i = z + (iE)/\log T$, $h_j = z + (jE)/\log T$, we have by (3.2) that

\[ 1 - r(j) := 1 - \mathbb{E}Z(h)Z\left(h + \frac{jE}{\log T}\right) \approx \frac{\log((jE/\log T)\log T)}{\log\log T - \log\log y} = \frac{\log(jE)}{\log\log T - \log\log y}, \]

and therefore $\mathbb{P}(\max_{i \in \mathcal{X}} Z(h_i) > \sqrt{2(\log\log T - \log\log y)})$ is

\[ \gg (\#\mathcal{X}) \frac{\log y \sqrt{\log E}}{\log T (\log\log T)^{3/2}} \prod_{j=1}^{[(\log T)/(E \log y)]} \Phi\left(\sqrt{2\log(jE)}\right). \]



[Footnote: Theorem 1 of [6] deals with the case where $\mathcal{N} = \{1, 2, ..., n\}$. However, to treat the more general case
one can simply replace the sum over $1 \leq m \leq n$ in Proposition 1 of [6] by the corresponding sum over
$m \in \mathcal{N}$.]

To estimate the product, we note that if $z \geq 1$ then

\[ \Phi(z) = 1 - \frac{1}{\sqrt{2\pi}} \int_z^{\infty} e^{-t^2/2} dt \geq 1 - \frac{1}{z\sqrt{2\pi}} \int_z^{\infty} t e^{-t^2/2} dt = 1 - \frac{e^{-z^2/2}}{z\sqrt{2\pi}} \geq \exp(-e^{-z^2/2}/z), \]

and therefore (since $E \gg 1$, so we certainly have $\sqrt{2\log(jE)} \geq 1$ for all $j$)

\[ \prod_{j=1}^{[(\log T)/(E\log y)]} \Phi\left(\sqrt{2\log(jE)}\right) \geq \exp\left(-\sum_{j=1}^{[(\log T)/(E\log y)]} \frac{1}{jE\sqrt{2\log(jE)}}\right) \geq \exp\left(-\frac{1}{E} \sum_{j=1}^{[\log T]} \frac{1}{j\sqrt{2\log(jE)}}\right). \]

The sum here is $\ll \sqrt{\log\log T}$, so provided that $E \gg \sqrt{\log\log T}$ we will have

\[ \mathbb{P}\left(\max_{i \in \mathcal{X}} Z(h_i) > \sqrt{2(\log\log T - \log\log y)}\right) \gg (\#\mathcal{X}) \frac{\log y \sqrt{\log E}}{\log T (\log\log T)^{3/2}}. \]

If the set $\mathcal{X}$ contains $\gg (\log T)/(E \log y)$ points then this lower bound will be "not too
small", in a sense that will become clear shortly. Note that it is extremely important
that the parameter $E$ be present here, and be large enough to compensate for the sum
over $j$, since otherwise the lower bound becomes smaller by an exponential factor.
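The two estimates used above can be sanity-checked numerically. The following Python sketch is an illustration only and is not part of the original argument: the function names, the grid of $z$ values, the truncation at 10000 terms, and the sample values $E = 2$ and $E = 50$ are all assumptions made for the example. It checks the inequality $\Phi(z) \geq \exp(-e^{-z^2/2}/z)$ on a grid, and compares the logarithm of the product $\prod_j \Phi(\sqrt{2\log(jE)})$ with the lower bound $-\sum_j 1/(jE\sqrt{2\log(jE)})$ for a small and a large value of $E$.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tail_lower_bound(z):
    """The lower bound exp(-e^{-z^2/2}/z) for Phi(z), used in the text for z >= 1."""
    return math.exp(-math.exp(-z * z / 2.0) / z)

def log_product(E, n_terms):
    """log of prod_{j=1}^{n_terms} Phi(sqrt(2 log(jE))), computed term by term."""
    return sum(math.log(Phi(math.sqrt(2.0 * math.log(j * E))))
               for j in range(1, n_terms + 1))

def log_product_bound(E, n_terms):
    """The lower bound -sum_j 1/(jE sqrt(2 log(jE))) for the same logarithm."""
    return -sum(1.0 / (j * E * math.sqrt(2.0 * math.log(j * E)))
                for j in range(1, n_terms + 1))

# Illustrative grid 1.00, 1.01, ..., 6.00 (an assumption of this sketch).
grid = [1.0 + 0.01 * k for k in range(501)]
inequality_holds = all(Phi(z) >= tail_lower_bound(z) for z in grid)

n_terms = 10000  # hypothetical stand-in for [(log T)/(E log y)]
small_E = (log_product(2.0, n_terms), log_product_bound(2.0, n_terms))
large_E = (log_product(50.0, n_terms), log_product_bound(50.0, n_terms))
```

With these sample values the product is markedly closer to 1 for $E = 50$ than for $E = 2$, which is the effect of the factor $1/E$ in front of the sum.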

Actually it is not difficult to perform the foregoing calculations rigorously, provided
we assume that $\mathcal{X}$ is contained in an interval of length $(\log T)/(KE \log y)$ for a suitable
large constant $K > 0$. To make things rigorous we need to check that the function

\[ r(j) = \mathbb{E}Z(h)Z\left(h + \frac{jE}{\log T}\right) = \frac{\sum_{y \leq p \leq T} \frac{\cos((jE/\log T)\log p)}{p} \frac{\log^2(T/p)}{\log^2 T}}{\sum_{y \leq p \leq T} \frac{1}{p} \frac{\log^2(T/p)}{\log^2 T}}, \quad 1 \leq j \leq (\log T)/(KE\log y) \]

is decreasing and non-negative; that $r(1)(1 + 1/(\log\log T - \log\log y)) \leq 1$; and that the
"big Oh" error terms in the statement of Lower Bound 1, and in the correlations (3.2),
do not alter the calculations. All of these things are straightforward to check using
(3.2), except for the condition that $r(j)$ is decreasing and non-negative, which follows
from the discussion at the end of our first appendix (with $P = y$ and $Q = T$) provided
that $K$ is large enough and $\sqrt{\log y} \gg \log\log T$.

Again, we shall summarise the state of affairs we have reached as a lemma. 



Lemma 3. Suppose that A/log log T ^ E <^ yJlogT and (log log T)^ ^ logy ^ 
(log log T) ^°'^° , and let Z{h), h G "H* be the corresponding collection of mean zero, 
variance one, jointly multivariate normal random variables from Lemma 2. 

Suppose that $\mathcal{X}$ is a set of integers contained in an interval of length $(\log T)/(KE\log y)$,
such that $\#\mathcal{X} \geq (\log T)/(2KE \log y)$ (say) and such that

\[ h_i := z + \frac{iE}{\log T} \in \mathcal{H}^* \quad \forall i \in \mathcal{X}. \]

Then $\mathbb{P}(\max_{i \in \mathcal{X}} Z(h_i) > \sqrt{2(\log\log T - \log\log y)}) \gg \frac{\sqrt{\log E}}{E(\log\log T)^{3/2}}$, where the implicit
constant is absolute.



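Now recall, from our Preliminary Calculations, that $\mathcal{H}^*$ is a certain subset of
$\{z + iE/\log T : 0 \leq i \leq (2\pi\log T)/E - 1\}$, and that $\mathcal{H}^*$ contains at least $1.98\pi(\log T)/E$
points. Thus if we let

\[ \mathcal{H}^*_{even} := \mathcal{H}^* \cap \bigcup_{\substack{0 \leq j \leq 2\pi K\log y, \\ j \text{ even}}} \left\{ z + \frac{iE}{\log T} : \frac{j\log T}{KE\log y} \leq i < \frac{(j+1)\log T}{KE\log y} \right\}, \]

then $\mathcal{H}^*_{even}$ must contain at least $0.95\pi(\log T)/E$ points, say, since the complementary
set $\mathcal{H}^*_{odd}$ (where $j$ runs over odd integers) must satisfy

\[ \#\mathcal{H}^*_{odd} \leq (\pi K\log y + O(1))((\log T)/(KE\log y) + O(1)) \leq 1.03\pi(\log T)/E. \]

In particular, we may find sets $(\mathcal{X}_k)_{1 \leq k \leq \log y}$, each a subset of
$\{j(\log T)/(KE\log y) \leq i < (j+1)(\log T)/(KE\log y)\}$ for some distinct even $j$, such that

\[ \{z + iE/\log T : i \in \mathcal{X}_k\} \subseteq \mathcal{H}^*_{even} \quad \text{and} \quad \#\mathcal{X}_k \geq (\log T)/(2KE\log y) \quad \forall 1 \leq k \leq \log y. \]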

The point of this manoeuvre is that we have good information about
$\mathbb{P}(\max_{i \in \mathcal{X}_k} Z(h_i) > \sqrt{2(\log\log T - \log\log y)})$ for each set $\mathcal{X}_k$, in view of Lemma 3,
and moreover the different sets $\mathcal{X}_k$ are sufficiently separated that the random variables
$Z(h_i)$ corresponding to different sets are "almost independent". (Note that
$|h_i - h_j| \geq 1/(K\log y)$ if $h_i, h_j$ correspond to different blocks $\mathcal{X}_k$, and recall that the
covariances (3.2) decay rapidly between distant points $h_i, h_j$.)

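More precisely, let us write $\max_{k; i \in \mathcal{X}_k} Z(h_i)$ to mean $\max_{1 \leq k \leq \log y} \max_{i \in \mathcal{X}_k} Z(h_i)$. Then
we obviously have that $\mathbb{P}(\max_{h \in \mathcal{H}^*} Z(h) > \sqrt{2(\log\log T - \log\log y)})$ is

\[ \begin{aligned} &\geq \mathbb{P}\left(\max_{k; i \in \mathcal{X}_k} Z(h_i) > \sqrt{2(\log\log T - \log\log y)}\right) \\ &= 1 - \prod_{k=1}^{[\log y]} \mathbb{P}\left(\max_{i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right) \\ &\quad + \prod_{k=1}^{[\log y]} \mathbb{P}\left(\max_{i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right) - \mathbb{P}\left(\max_{k; i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right), \end{aligned} \]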



[Footnote: This is the basic reason for introducing the parameter $y$ in the first place: we will use the blocks
$(\mathcal{X}_k)_{1 \leq k \leq \log y}$ to convert the probability lower bound in Lemma 3, which is fairly large but still $o(1)$,
into an overall lower bound $1 - o(1)$. There are general concentration inequalities for suprema of
Gaussian processes that could also be used for this, but would give a weaker result in Proposition 2.]

and in view of Lemma 3 this is

\[ \begin{aligned} &\geq 1 - \left(1 - \frac{c\sqrt{\log E}}{E(\log\log T)^{3/2}}\right)^{[\log y]} \\ &\quad + \prod_{k=1}^{[\log y]} \mathbb{P}\left(\max_{i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right) - \mathbb{P}\left(\max_{k; i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right), \end{aligned} \]

where $c > 0$ is a small absolute constant. In particular, provided that

\[ \frac{\sqrt{\log E}\log y}{E(\log\log T)^{3/2}} \to \infty \quad \text{as } T \to \infty \]

then the second term here is $o(1)$.

To estimate the difference in the final line we can use the following result, which 
is one of a family of normal comparison inequalities that bound a difference between 
multivariate normal probabilities in terms of differences of the covariance matrices of 
the relevant random variables. This particular result is due to Li and Shao [8]. 

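Comparison Inequality 1 (Li and Shao, 2002). Let $(X_1, X_2, ..., X_n)$ and $(W_1, ..., W_n)$
each be a vector of mean zero, variance one, jointly normal random variables, and write
$r^{(1)}_{i,j} = \mathbb{E}X_iX_j$ and $r^{(0)}_{i,j} = \mathbb{E}W_iW_j$. Let $u_1, ..., u_n$ be any real numbers. Then

\[ \left|\mathbb{P}(X_j \leq u_j \; \forall 1 \leq j \leq n) - \mathbb{P}(W_j \leq u_j \; \forall 1 \leq j \leq n)\right| \leq \frac{1}{2\pi} \sum_{1 \leq i < j \leq n} \left|\arcsin(r^{(1)}_{i,j}) - \arcsin(r^{(0)}_{i,j})\right| e^{-(u_i^2 + u_j^2)/(2(1 + \max\{|r^{(1)}_{i,j}|, |r^{(0)}_{i,j}|\}))}. \]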
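To get a feel for Comparison Inequality 1, one can test its simplest instance numerically: $n = 2$, with $(X_1, X_2)$ having correlation $r$, $(W_1, W_2)$ independent, and $u_1 = u_2 = u$. The Python sketch below is an illustration only; the function names, the Simpson's rule discretisation, the truncation of the integral at $-10$, and the sample values of $u$ and $r$ are assumptions of the example rather than anything from the original text. At $u = 0$ the inequality is an equality, by the classical formula $\mathbb{P}(X_1 \leq 0, X_2 \leq 0) = 1/4 + \arcsin(r)/(2\pi)$.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bivariate_cdf(u, r, steps=20000, lower=-10.0):
    """P(X1 <= u, X2 <= u) for standard normals with correlation r, |r| < 1.

    Uses Simpson's rule on int_{lower}^{u} phi(x) Phi((u - r x)/sqrt(1 - r^2)) dx;
    `steps` must be even, and the mass below `lower` is neglected.
    """
    s = math.sqrt(1.0 - r * r)
    f = lambda x: phi(x) * Phi((u - r * x) / s)
    h = (u - lower) / steps
    total = f(lower) + f(u)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lower + k * h)
    return total * h / 3.0

def comparison_bound(u, r):
    """Right-hand side of the inequality for n = 2, correlations r versus 0."""
    return abs(math.asin(r)) * math.exp(-u * u / (1.0 + abs(r))) / (2.0 * math.pi)

def actual_difference(u, r):
    """|P(X1 <= u, X2 <= u) - Phi(u)^2|, the left-hand side for n = 2."""
    return abs(bivariate_cdf(u, r) - Phi(u) ** 2)

checks = [(u, r, actual_difference(u, r), comparison_bound(u, r))
          for u in (0.0, 1.0, 2.0, 3.0) for r in (-0.5, 0.2, 0.5, 0.9)]
```

The rapid decay of the bound in $u$ is what makes the inequality useful at the high levels $u_j = \sqrt{2(\log\log T - \log\log y)}$ considered in the text.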

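The difference that we want to bound is of the kind treated by Comparison Inequality
1, where $u_j = \sqrt{2(\log\log T - \log\log y)}$ for all $j$, the $W_j$ are simply the random variables
$(Z(h_i))_{1 \leq k \leq \log y, i \in \mathcal{X}_k}$, and the $X_j$ are the same but with the covariances $\mathbb{E}Z(h_{i_1})Z(h_{i_2})$
replaced by zero when $i_1, i_2$ do not belong to the same block $\mathcal{X}_k$. (Since our random
variables are jointly normal, this is equivalent to saying that the $X_j$ are the same as the
$(Z(h_i))$ except that they are independent in different blocks $\mathcal{X}_k$, hence the probability
factors as a product over $k$.) Thus we find

\[ \begin{aligned} &\left| \prod_{k=1}^{[\log y]} \mathbb{P}\left(\max_{i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right) - \mathbb{P}\left(\max_{k; i \in \mathcal{X}_k} Z(h_i) \leq \sqrt{2(\log\log T - \log\log y)}\right) \right| \\ &\leq \frac{1}{2\pi} \sum_{1 \leq k < l \leq \log y} \sum_{i \in \mathcal{X}_k} \sum_{j \in \mathcal{X}_l} \left|\arcsin(\mathbb{E}Z(h_i)Z(h_j))\right| e^{-2(\log\log T - \log\log y)/(1 + |\mathbb{E}Z(h_i)Z(h_j)|)} \\ &\ll \left(\frac{\log y}{\log T}\right)^2 \sum_{1 \leq k < l \leq \log y} \sum_{i \in \mathcal{X}_k} \sum_{j \in \mathcal{X}_l} \frac{1}{|h_i - h_j| \log y \log\log T}, \end{aligned} \]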



where the final line uses the estimate (3.2) for the correlations, and we note in particular
that $|\mathbb{E}Z(h_i)Z(h_j)| \ll 1/\log\log T$ provided $|h_i - h_j| \gg 1/\log y$, as is the case when
$i \in \mathcal{X}_k, j \in \mathcal{X}_l$ by construction of the blocks $\mathcal{X}_k$. Since $\#\mathcal{X}_k \leq (\log T)/(E\log y)$ for all $k$,
and $|h_i - h_j| \gg |l - k|/\log y$ for all $i \in \mathcal{X}_k, j \in \mathcal{X}_l$, the above is

\[ \ll \frac{1}{E^2} \sum_{1 \leq k < l \leq \log y} \frac{1}{(l-k)\log\log T} \ll \frac{\log y \log\log y}{E^2 \log\log T}. \]

l<k<l<logy V ^ & <= & <= 

Finally, choosing i? = i/log log T(log log log T)^ and logy = (log log T)^ (log log log T) 
say, we have 



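\[ \frac{\sqrt{\log E}\log y}{E(\log\log T)^{3/2}} \to \infty \quad \text{and} \quad \frac{\log y \log\log y}{E^2 \log\log T} \to 0 \quad \text{as } T \to \infty, \]

and therefore we have

\[ \mathbb{P}\left(\max_{h \in \mathcal{H}^*} Z(h) > \sqrt{2(\log\log T - \log\log y)}\right) \geq 1 - o(1). \]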

In view of Lemma 2, this suffices to complete the proof of Proposition 2. 



Q.E.D. 



Appendix A. Covariance calculations 

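In this appendix we perform some variance and covariance calculations that are
necessary for the probabilistic arguments in §3, but are really just estimates for various
sums over primes.

For any fixed $2 \leq P < Q \leq T$, let us write

\[ X_{P,Q}(h) := \Re \sum_{P \leq p \leq Q} \frac{U_p}{p^{1/2+ih}} \frac{\log(T/p)}{\log T}. \]

Then we have

\[ \begin{aligned} \mathbb{E}X_{P,Q}(h)^2 &= \sum_{P \leq p_1, p_2 \leq Q} \frac{\mathbb{E}\Re(U_{p_1}p_1^{-ih})\Re(U_{p_2}p_2^{-ih})}{p_1^{1/2}p_2^{1/2}} \frac{\log(T/p_1)\log(T/p_2)}{\log^2 T} \\ &= \sum_{P \leq p_1, p_2 \leq Q} \frac{\mathbb{E}(1/2)(U_{p_1}p_1^{-ih} + \overline{U_{p_1}}p_1^{ih})(1/2)(U_{p_2}p_2^{-ih} + \overline{U_{p_2}}p_2^{ih})}{p_1^{1/2}p_2^{1/2}} \frac{\log(T/p_1)\log(T/p_2)}{\log^2 T} \\ &= \frac{1}{2} \sum_{P \leq p \leq Q} \frac{1}{p} \frac{\log^2(T/p)}{\log^2 T}, \end{aligned} \]

since $\mathbb{E}U_{p_1}U_{p_2} = 0$ for all $p_1, p_2$, and $\mathbb{E}U_{p_1}\overline{U_{p_2}} = 0$ unless $p_1 = p_2$, in which case $\mathbb{E}|U_p|^2 = 1$.
It is a standard fact (see e.g. Theorem 2.7 of Montgomery and Vaughan [9]) that

\[ \sum_{p \leq x} \frac{1}{p} = \log\log x + b + O\left(\frac{1}{\log x}\right), \quad \sum_{p \leq x} \frac{\log p}{p} \ll \log x, \quad \sum_{p \leq x} \frac{\log^2 p}{p} \ll \log^2 x, \]

for a certain constant $b$, and therefore we have

\[ \begin{aligned} \mathbb{E}X_{P,Q}(h)^2 &= \frac{1}{2}\left( \sum_{P \leq p \leq Q} \frac{1}{p} - \frac{2}{\log T}\sum_{P \leq p \leq Q} \frac{\log p}{p} + \frac{1}{\log^2 T}\sum_{P \leq p \leq Q} \frac{\log^2 p}{p} \right) \\ &= (1/2)(\log\log Q - \log\log P + O(1)). \end{aligned} \]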
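The variance estimate can be illustrated numerically for small parameters. The following Python sketch is an illustration only: the sieve, the function names, the choice $P = 100$, $Q = T = 10^6$, and the use of the numerical value of the Meissel-Mertens constant for $b$ are assumptions of the example. It compares $\sum_{p \leq x} 1/p$ with $\log\log x + b$, and evaluates the weighted sum $(1/2)\sum_{P \leq p \leq Q} (1/p)\log^2(T/p)/\log^2 T$ directly.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def variance(P, Q, T, primes):
    """(1/2) sum_{P <= p <= Q} (1/p) (log(T/p)/log T)^2."""
    log_T = math.log(T)
    return 0.5 * sum((math.log(T / p) / log_T) ** 2 / p for p in primes if P <= p <= Q)

MERTENS_B = 0.2614972128  # numerical value of the constant b (Meissel-Mertens constant)

x = 10 ** 6
primes = primes_up_to(x)
reciprocal_sum = sum(1.0 / p for p in primes)
mertens_error = reciprocal_sum - (math.log(math.log(x)) + MERTENS_B)

P, Q, T = 100, x, x  # hypothetical small parameters
var_value = variance(P, Q, T, primes)
main_term = 0.5 * (math.log(math.log(Q)) - math.log(math.log(P)))
```

For parameters this small the $O(1)$ term is of the same order as the main term, so the comparison only illustrates the shape of the estimate; the asymptotic statement concerns $\log\log Q - \log\log P \to \infty$.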

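Turning to covariances, the same calculations as above show that

\[ \begin{aligned} \mathbb{E}X_{P,Q}(h_1)X_{P,Q}(h_2) &= \sum_{P \leq p_1, p_2 \leq Q} \frac{\mathbb{E}\Re(U_{p_1}p_1^{-ih_1})\Re(U_{p_2}p_2^{-ih_2})}{p_1^{1/2}p_2^{1/2}} \frac{\log(T/p_1)\log(T/p_2)}{\log^2 T} \\ &= \frac{1}{2} \sum_{P \leq p \leq Q} \frac{\cos((h_1 - h_2)\log p)}{p} \left(1 - \frac{2\log p}{\log T} + \frac{\log^2 p}{\log^2 T}\right). \end{aligned} \]

More explicitly, by a strong form of the prime number theorem (see e.g. Theorem 6.9
of Montgomery and Vaughan [9]) we have

\[ \pi(z) := \#\{p \leq z : p \text{ prime}\} = \int_2^z \frac{du}{\log u} + O(z e^{-d\sqrt{\log z}}), \quad z \geq 2, \]

where $d > 0$ is an absolute constant. Therefore, for any $a \neq 0$ we find $\sum_{P \leq p \leq Q} \cos(a\log p)/p$
is

\[ \begin{aligned} \int_P^Q \frac{\cos(a\log u)}{u} d\pi(u) &= \int_P^Q \frac{\cos(a\log u)}{u\log u} du + O((1+|a|)e^{-d\sqrt{\log P}}) \\ &= \int_{a\log P}^{a\log Q} \frac{\cos v}{v} dv + O((1+|a|)e^{-d\sqrt{\log P}}) \\ &= \begin{cases} \log\log Q - \log\log P + O(1) & \text{if } |a\log Q| \leq 1 \\ \log(1/|a\log P|) + O(1) & \text{if } \frac{1}{\log Q} < |a| \leq \frac{1}{\log P} \\ O(1/(|a|\log P) + (1+|a|)e^{-d\sqrt{\log P}}) & \text{otherwise}, \end{cases} \end{aligned} \]

the final line using the estimate $\cos v = 1 + O(v^2)$ when $|v| \leq 1$, and integration by
parts on the rest of the range of integration. Similar calculations show that

\[ \sum_{P \leq p \leq Q} \frac{\cos(a\log p)\log p}{p\log T}, \; \sum_{P \leq p \leq Q} \frac{\cos(a\log p)\log^2 p}{p\log^2 T} = O(1/(1 + |a\log T|) + (1+|a|)e^{-d\sqrt{\log P}}), \]

and therefore we have

\[ \mathbb{E}X_{P,Q}(h_1)X_{P,Q}(h_2) = \begin{cases} (1/2)(\log\log Q - \log\log P + O(1)) & \text{if } |h_1 - h_2| \leq \frac{1}{\log Q} \\ (1/2)(\log(1/|h_1 - h_2|) - \log\log P + O(1)) & \text{if } \frac{1}{\log Q} < |h_1 - h_2| \leq \frac{1}{\log P} \\ O(1/(|h_1 - h_2|\log P)) & \text{if } \frac{1}{\log P} < |h_1 - h_2| \leq 2\pi. \end{cases} \]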
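The qualitative behaviour of $\sum_{P \leq p \leq Q} \cos(a\log p)/p$ in the different ranges of $a$ can be observed numerically. The Python sketch below is an illustration only; the sieve, the function names, and the sample values $P = 100$, $Q = 10^6$, $a = 1/\log Q$ and $a = 2$ are assumptions of the example. For $|a\log Q| \leq 1$ the cosine sum differs from the plain sum $\sum 1/p$ by at most $(a^2/2)\sum \log^2 p/p$ (because $1 - \cos v \leq v^2/2$), whereas for $|a| > 1/\log P$ the oscillation produces substantial cancellation.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def cos_prime_sum(a, P, Q, primes):
    """sum_{P <= p <= Q} cos(a log p)/p."""
    return sum(math.cos(a * math.log(p)) / p for p in primes if P <= p <= Q)

P, Q = 100, 10 ** 6  # hypothetical small parameters
primes = primes_up_to(Q)

plain = cos_prime_sum(0.0, P, Q, primes)          # a = 0: the sum of 1/p
small_a = 1.0 / math.log(Q)                       # first range: |a log Q| <= 1
near_diagonal = cos_prime_sum(small_a, P, Q, primes)
oscillating = cos_prime_sum(2.0, P, Q, primes)    # third range: |a| > 1/log P
second_moment = sum(math.log(p) ** 2 / p for p in primes if P <= p <= Q)
```

Comparing `plain`, `near_diagonal` and `oscillating` shows the transition from the logarithmic main term to the decaying regime described by the three cases above.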

Finally we make a few more qualitative observations about $\mathbb{E}X_{P,Q}(h_1)X_{P,Q}(h_2)$.
Firstly, the foregoing calculations show that $\mathbb{E}X_{P,Q}(h)^2$ doesn't depend on $h$, and that
$\mathbb{E}X_{P,Q}(h_1)X_{P,Q}(h_2)$ is a function of $|h_1 - h_2|$ (in other words the random variables
$X_{P,Q}(h)$ are stationary). Secondly, there exists an absolute constant $K > 0$ such that

\[ \mathbb{E}X_{P,Q}(h_1)X_{P,Q}(h_2) \geq 0 \quad \text{if } |h_1 - h_2| \leq \frac{1}{K\log P}. \]

And thirdly, if we write $r_{P,Q}(h) := \mathbb{E}X_{P,Q}(h_1)X_{P,Q}(h_1 + h)$ (which doesn't depend on
$h_1$, because of stationarity), and if $0 < \delta \leq h \leq 2\pi$ and $\sqrt{\log P} \gg \log\log Q$, then we
have

\[ \begin{aligned} r_{P,Q}(h) - r_{P,Q}(h + \delta) &= \frac{1}{2}\int_{h\log P}^{(h+\delta)\log P} \frac{\cos v}{v} dv - \frac{1}{2}\int_{h\log Q}^{(h+\delta)\log Q} \frac{\cos v}{v} dv + O\left(\frac{1}{h\log T} + e^{-d\sqrt{\log P}}\right) \\ &= \frac{1}{2}\int_{h\log P}^{(h+\delta)\log P} \frac{\cos v}{v} dv + O\left(\frac{1}{h\log Q}\right), \end{aligned} \]

using integration by parts. In particular, if $h + \delta \leq 1/\log P$ (say) then the integral here
is $\gg \int_{h\log P}^{(h+\delta)\log P} (1/v) dv \gg \delta/h$, and so if $\delta \geq K/\log Q$ then

\[ r_{P,Q}(h) - r_{P,Q}(h + \delta) \geq 0. \]

Appendix B. A multivariate central limit theorem 

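In this appendix we formulate a version of the central limit theorem that justifies
Lemma 2, above, in which the random variables $Y(h)$, $h \in \mathcal{H}^*$ were replaced by normal
random variables $Z(h)$ with the same means and covariances.

In fact, we will sketch a proof of the following theorem.

Central Limit Theorem 1 (Specialised from Theorem 2.1 of Reinert and Röllin [10]).
Suppose that $n \geq 1$, and that $\mathcal{H}$ is a finite non-empty set. Suppose that for each
$1 \leq i \leq n$ and $h \in \mathcal{H}$ we are given a deterministic coefficient $c(i,h) \in \mathbb{C}$. Finally,
suppose that $(V_i)_{1 \leq i \leq n}$ is a sequence of independent, mean zero, complex valued random
variables, and let $Y = (Y_h)_{h \in \mathcal{H}}$ be the $\#\mathcal{H}$-dimensional random vector with components

\[ Y_h := \Re\left(\sum_{i=1}^{n} c(i,h)V_i\right). \]

If $Z = (Z_h)_{h \in \mathcal{H}}$ is a multivariate normal random vector with the same mean vector
and covariance matrix as $Y$, then for any $u \in \mathbb{R}$ and any small $\delta > 0$ we have

\[ \mathbb{P}\left(\max_{h \in \mathcal{H}} Y_h \leq u\right) \leq \mathbb{P}\left(\max_{h \in \mathcal{H}} Z_h \leq u + \delta\right) + O\left( \frac{1}{\delta^2} \sum_{g,h \in \mathcal{H}} \sqrt{\sum_{i=1}^{n} |c(i,g)|^2|c(i,h)|^2 \mathbb{E}|V_i|^4} + \frac{1}{\delta^3} \sum_{i=1}^{n} \mathbb{E}|V_i|^3 \left(\sum_{h \in \mathcal{H}} |c(i,h)|\right)^3 \right). \]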



There is an exactly similar lower bound for $\mathbb{P}(\max_{h \in \mathcal{H}} Y_h \leq u)$, in which
$\mathbb{P}(\max_{h \in \mathcal{H}} Z_h \leq u + \delta)$ is replaced by $\mathbb{P}(\max_{h \in \mathcal{H}} Z_h \leq u - \delta)$.

Before discussing its proof, let us see how Central Limit Theorem 1 is applicable to
the random variables $(Y(h))_{h \in \mathcal{H}^*}$ considered in §3. Recall that, by definition,

\[ Y(h) := \frac{\Re \sum_{y \leq p \leq T} \frac{U_p}{p^{1/2+ih}} \frac{\log(T/p)}{\log T}}{\sqrt{\frac{1}{2}\sum_{y \leq p \leq T} \frac{1}{p} \frac{\log^2(T/p)}{\log^2 T}}}, \]

so we are in the setting of Central Limit Theorem 1 with the indices $i$ replaced by primes
$y \leq p \leq T$, and $V_p = U_p$, and

\[ c(p,h) = \frac{\frac{1}{p^{1/2+ih}} \frac{\log(T/p)}{\log T}}{\sqrt{\frac{1}{2}\sum_{y \leq p \leq T} \frac{1}{p} \frac{\log^2(T/p)}{\log^2 T}}} = \frac{\frac{1}{p^{1/2+ih}} \frac{\log(T/p)}{\log T}}{\sqrt{(1/2)(\log\log T - \log\log y + O(1))}}. \]

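Now Central Limit Theorem 1 implies that $\mathbb{P}(\max_{h \in \mathcal{H}^*} Y(h) \leq u)$ is at most

\[ \mathbb{P}\left(\max_{h \in \mathcal{H}^*} Z(h) \leq u + \delta\right) + O\left( \frac{(\#\mathcal{H}^*)^2}{\delta^2} \sqrt{\sum_{y \leq p \leq T} \frac{1}{p^2(\log\log T)^2}} + \frac{(\#\mathcal{H}^*)^3}{\delta^3} \sum_{y \leq p \leq T} \frac{1}{p^{3/2}(\log\log T)^{3/2}} \right). \]

Provided that $y \geq ((\#\mathcal{H}^*)/\delta)^6$, say, the "big Oh" term here is $o(1)$ as $T \to \infty$. Choosing

\[ u = \frac{\log\log T - 2\log\log\log T - (\log\log\log T)^{3/4}}{\sqrt{(1/2)(\log\log T - \log\log y + O(1))}}, \]

and $\delta = 1/\sqrt{\log\log T}$, we conclude that

\[ \begin{aligned} &\mathbb{P}\left(\max_{h \in \mathcal{H}^*} Y(h) > \frac{\log\log T - 2\log\log\log T - (\log\log\log T)^{3/4}}{\sqrt{(1/2)(\log\log T - \log\log y + O(1))}}\right) \\ &\geq \mathbb{P}\left(\max_{h \in \mathcal{H}^*} Z(h) > \frac{\log\log T - 2\log\log\log T - (\log\log\log T)^{3/4}}{\sqrt{(1/2)(\log\log T - \log\log y + O(1))}} + \frac{1}{\sqrt{\log\log T}}\right) + o(1), \end{aligned} \]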

provided that y > (loglogT)3(#-H*)^ Since #?{* < i\ogT)/E, this condition is cer- 
tainly satisfied if y > log T, say, and so Lemma 2 follows. 

We will deduce Central Limit Theorem 1 from a much more general normal approximation
theorem of Reinert and Röllin [10], which they prove using Stein's method of
exchangeable pairs. If one only wants a result like Central Limit Theorem 1 this could
probably be deduced from many other existing results as well, but since Reinert and
Röllin's result is neat and powerful, and automatically supplies explicit "big Oh" terms,
we will work from there. See also the second appendix in the author's paper [6], where
Reinert and Röllin's result was applied to a very similar problem.

To apply Theorem 2.1 of Reinert and Röllin [10], we first need to construct a random
vector $Y' = (Y'_h)_{h \in \mathcal{H}}$ such that the pair $(Y, Y')$ is exchangeable, i.e. such that $(Y, Y')$ has
the same law as $(Y', Y)$. In fact there is a standard way to do this: let $I$ be a random
variable independent of everything else, having the discrete uniform distribution on the
set $\{1, 2, ..., n\}$, and let $(V'_i)_{1 \leq i \leq n}$ be independent random variables having the same
distribution as the $V_i$; then conditional on the event $I = i$, define

\[ Y'_h := Y_h - \Re(c(i,h)V_i) + \Re(c(i,h)V'_i), \quad h \in \mathcal{H}. \]

Since there is perfect symmetry between the roles of $V_i$ and $V'_i$, the reader may readily
convince themselves that $(Y, Y')$ form an exchangeable pair.

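Next we shall perform a few conditional expectation calculations we will need, beginning
with the calculation of $\mathbb{E}(Y' - Y|Y) = \mathbb{E}((Y'_h - Y_h)_{h \in \mathcal{H}}|(Y_h)_{h \in \mathcal{H}})$. Since $I$ is
independent of $Y$, and is distributed uniformly on $\{1, 2, ..., n\}$, we have

\[ \begin{aligned} \mathbb{E}(Y' - Y|Y) &= \sum_{i=1}^{n} \frac{1}{n} \mathbb{E}(Y' - Y|Y, I = i) \\ &= \sum_{i=1}^{n} \frac{1}{n} \mathbb{E}((-\Re(c(i,h)V_i) + \Re(c(i,h)V'_i))_{h \in \mathcal{H}}|(Y_h)_{h \in \mathcal{H}}, I = i) \\ &= \sum_{i=1}^{n} \frac{1}{n} \mathbb{E}((-\Re(c(i,h)V_i))_{h \in \mathcal{H}}|(Y_h)_{h \in \mathcal{H}}, I = i), \end{aligned} \]

the final line using the fact that $V'_i$ is independent of $Y$ and $I$, and has mean zero. Now
we observe that $(-\Re(c(i,h)V_i))_{h \in \mathcal{H}}$ is independent of $I$, and so we have

\[ \begin{aligned} \mathbb{E}(Y' - Y|Y) &= \sum_{i=1}^{n} \frac{1}{n} \mathbb{E}((-\Re(c(i,h)V_i))_{h \in \mathcal{H}}|(Y_h)_{h \in \mathcal{H}}) = \frac{1}{n} \mathbb{E}\left(\left(\sum_{i=1}^{n} -\Re(c(i,h)V_i)\right)_{h \in \mathcal{H}} \Big| (Y_h)_{h \in \mathcal{H}}\right) \\ &= -\frac{1}{n} \mathbb{E}((Y_h)_{h \in \mathcal{H}}|(Y_h)_{h \in \mathcal{H}}), \end{aligned} \]

where the second equality uses linearity of expectation, and the third equality uses the
definition of $Y_h$. We conclude that

\[ \mathbb{E}(Y' - Y|Y) = -\frac{1}{n} Y. \]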
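The exchangeable pair construction and the identity $\mathbb{E}(Y' - Y|Y) = -Y/n$ can be verified exactly in a toy case by enumerating all outcomes. The Python sketch below is an illustration only: the choice $n = 3$, the two-element index set, the real coefficients in `coeffs`, and the use of $\pm 1$ valued $V_i$ are hypothetical choices made for the example (with real coefficients and real $V_i$ the real part is redundant). The code conditions on the full vector $(V_i)$, which is finer than conditioning on $Y$; the identity for $Y$ then follows by the tower property.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical toy data: coeffs[i][h] plays the role of c(i, h), with n = 3 and two indices h.
coeffs = [[Fraction(1), Fraction(2)],
          [Fraction(-1), Fraction(1, 2)],
          [Fraction(3), Fraction(0)]]
n = len(coeffs)
values = (-1, 1)  # each V_i is +1 or -1 with probability 1/2, so it has mean zero

def Y(v):
    """The vector (Y_h)_h = (sum_i c(i,h) v_i)_h for a realisation v of (V_i)."""
    return tuple(sum(coeffs[i][h] * v[i] for i in range(n)) for h in range(2))

def conditional_drift(v):
    """E(Y' - Y | V = v): average over the uniform index I and the fresh copy V'_I."""
    drift = [Fraction(0), Fraction(0)]
    y_old = Y(v)
    for i in range(n):
        for v_new in values:
            w = list(v)
            w[i] = v_new  # resample coordinate i only
            y_new = Y(w)
            for h in range(2):
                drift[h] += Fraction(1, n * len(values)) * (y_new[h] - y_old[h])
    return tuple(drift)

drift_matches = all(conditional_drift(v) == tuple(-y / n for y in Y(v))
                    for v in product(values, repeat=n))

# Exchangeability: every outcome (v, i, v_new) is equally likely, so count pairs (Y, Y').
pairs = Counter()
for v in product(values, repeat=n):
    for i in range(n):
        for v_new in values:
            w = list(v)
            w[i] = v_new
            pairs[(Y(v), Y(w))] += 1
exchangeable = all(pairs[(a, b)] == pairs[(b, a)] for (a, b) in pairs)
```

Exact rational arithmetic is used so that both checks are equalities rather than approximate comparisons.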

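In a similar way, for any $g, h \in \mathcal{H}$ we have

\[ \begin{aligned} &\mathbb{E}((Y'_g - Y_g)(Y'_h - Y_h)|Y) \\ &= \frac{1}{n} \mathbb{E}\left(\sum_{i=1}^{n} (-\Re(c(i,g)V_i) + \Re(c(i,g)V'_i))(-\Re(c(i,h)V_i) + \Re(c(i,h)V'_i)) \Big| Y\right) \\ &= \frac{1}{n} \mathbb{E}\left(\sum_{i=1}^{n} \Re(c(i,g)V_i)\Re(c(i,h)V_i) \Big| Y\right) + \frac{1}{n} \mathbb{E}\left(\sum_{i=1}^{n} \Re(c(i,g)V'_i)\Re(c(i,h)V'_i)\right), \end{aligned} \]

the second equality using the fact that each $V'_i$ is independent of $V_i$ and $Y$ (and has
mean zero). Also, for any $f, g, h \in \mathcal{H}$ we have

\[ \mathbb{E}|(Y'_f - Y_f)(Y'_g - Y_g)(Y'_h - Y_h)| = \sum_{i=1}^{n} \frac{1}{n} \mathbb{E}\left|\Re(c(i,f)(V'_i - V_i))\Re(c(i,g)(V'_i - V_i))\Re(c(i,h)(V'_i - V_i))\right|. \]

Now Theorem 2.1 of Reinert and Röllin [10] asserts that, if $t : \mathbb{R}^{\#\mathcal{H}} \to \mathbb{R}$ is any three
times differentiable function,

\[ \begin{aligned} |\mathbb{E}t(Y) - \mathbb{E}t(Z)| &\leq \frac{n}{4} \sup_{g,h \in \mathcal{H}} \left\|\frac{\partial^2}{\partial x_g \partial x_h} t\right\|_{\infty} \sum_{g,h \in \mathcal{H}} \sqrt{\mathrm{Var}(\mathbb{E}((Y'_g - Y_g)(Y'_h - Y_h)|Y))} \\ &\quad + \frac{n}{12} \sup_{f,g,h \in \mathcal{H}} \left\|\frac{\partial^3}{\partial x_f \partial x_g \partial x_h} t\right\|_{\infty} \sum_{f,g,h \in \mathcal{H}} \mathbb{E}|(Y'_f - Y_f)(Y'_g - Y_g)(Y'_h - Y_h)|. \end{aligned} \]

Here the factor $n$ is the reciprocal of the $1/n$ arising in the condition $\mathbb{E}(Y' - Y|Y) = -(1/n)Y$.
To understand the terms in this bound, we note that the second sum in our expression
for $\mathbb{E}((Y'_g - Y_g)(Y'_h - Y_h)|Y)$, above, is deterministic (it is just an expectation, rather
than a conditional expectation), so can be ignored when computing the variance of
$\mathbb{E}((Y'_g - Y_g)(Y'_h - Y_h)|Y)$. Thus we have

\[ \begin{aligned} \mathrm{Var}(\mathbb{E}((Y'_g - Y_g)(Y'_h - Y_h)|Y)) &= \mathrm{Var}\left(\frac{1}{n} \mathbb{E}\left(\sum_{i=1}^{n} \Re(c(i,g)V_i)\Re(c(i,h)V_i) \Big| Y\right)\right) \\ &= \frac{1}{n^2} \mathrm{Var}\left(\mathbb{E}\left(\sum_{i=1}^{n} \Re(c(i,g)V_i)\Re(c(i,h)V_i) \Big| Y\right)\right) \\ &\leq \frac{1}{n^2} \mathrm{Var}\left(\sum_{i=1}^{n} \Re(c(i,g)V_i)\Re(c(i,h)V_i)\right), \end{aligned} \]

where the final line uses the fact that conditioning reduces variance. At this point, since
the $V_i$ are independent we conclude that $\mathrm{Var}(\mathbb{E}((Y'_g - Y_g)(Y'_h - Y_h)|Y))$ is

\[ \leq \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(\Re(c(i,g)V_i)\Re(c(i,h)V_i)) \leq \frac{1}{n^2} \sum_{i=1}^{n} \mathbb{E}\left(\Re(c(i,g)V_i)^2\Re(c(i,h)V_i)^2\right) \leq \frac{1}{n^2} \sum_{i=1}^{n} |c(i,g)|^2|c(i,h)|^2\mathbb{E}|V_i|^4. \]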

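Rather more straightforwardly, our expression for $\mathbb{E}|(Y'_f - Y_f)(Y'_g - Y_g)(Y'_h - Y_h)|$, above,
together with the fact that $\mathbb{E}|V'_i|^3 = \mathbb{E}|V_i|^3$, imply that

\[ \mathbb{E}|(Y'_f - Y_f)(Y'_g - Y_g)(Y'_h - Y_h)| \ll \frac{1}{n} \sum_{i=1}^{n} |c(i,f)||c(i,g)||c(i,h)|\mathbb{E}|V_i|^3. \]

Finally we can set $t((x_h)_{h \in \mathcal{H}}) := \prod_{h \in \mathcal{H}} s(x_h)$, where $s : \mathbb{R} \to [0,1]$ is any three times
differentiable function such that

\[ s(x) = \begin{cases} 1 & \text{if } x \leq u \\ 0 & \text{if } x \geq u + \delta. \end{cases} \]

We can find such $s$ with derivatives satisfying $|s^{(r)}(x)| = O(\delta^{-r})$, $0 \leq r \leq 3$, in which
case we will have

\[ \sup_{g,h \in \mathcal{H}} \left\|\frac{\partial^2}{\partial x_g \partial x_h} t\right\|_{\infty} \ll \delta^{-2}, \quad \sup_{f,g,h \in \mathcal{H}} \left\|\frac{\partial^3}{\partial x_f \partial x_g \partial x_h} t\right\|_{\infty} \ll \delta^{-3}. \]

Then we see

\[ \begin{aligned} \mathbb{P}\left(\max_{h \in \mathcal{H}} Y_h \leq u\right) - \mathbb{P}\left(\max_{h \in \mathcal{H}} Z_h \leq u + \delta\right) &\leq \mathbb{E}t(Y) - \mathbb{E}t(Z) \\ &\ll \frac{1}{\delta^2} \sum_{g,h \in \mathcal{H}} \sqrt{\sum_{i=1}^{n} |c(i,g)|^2|c(i,h)|^2\mathbb{E}|V_i|^4} + \frac{1}{\delta^3} \sum_{i=1}^{n} \mathbb{E}|V_i|^3 \left(\sum_{h \in \mathcal{H}} |c(i,h)|\right)^3, \end{aligned} \]

which is the upper bound claimed in Central Limit Theorem 1. The lower bound follows
by instead choosing $s(x)$ to be 1 if $x \leq u - \delta$, and 0 if $x \geq u$.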

Q.E.D.